Supplementary Materials for Incomplete Multimodality-Diffused Emotion Recognition

Neural Information Processing Systems

In this supplementary material, we first present the details of the conditional score network in Sec. 2 [...] Sec. 4. Finally, we conduct experiments on the Chinese MER dataset CH-SIMS [...], which is subsequently fixed for the model (i.e., not learnable).

Table 1: Hyperparameter settings in IMDer.

Hyperparameter                                    CMU-MOSI   CMU-MOSEI
Optimizer                                         Adam       Adam
Batch size                                        32         128
Learning rate                                     0.001      0.002
σ used in our stochastic differential equation    25         25
Number of iterations for Euler-Maruyama solver    500        500
Kernel size for the shallow feature extractor E   [...]      [...]

CH-SIMS contains 2281 refined video segments with fine-grained annotations for each modality. For the vision modality, we use MultiComp OpenFace 2.0 to extract facial features [...]. The experimental results are listed in Tab. 3: our proposed IMDer consistently achieves better results than MMIN or GCNet under the random missing protocol.
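Taken together, the table describes a variance-exploding SDE with σ = 25 whose reverse-time dynamics are integrated by a 500-step Euler-Maruyama solver. A minimal sketch of such a conditional sampler is given below; it illustrates the general technique, not IMDer's released code. `score_net` stands in for the paper's conditional score network, `cond` for the available-modality conditioning, and the 2-D (batch, feature) shape of the recovered representation is an assumption.

```python
import torch

def euler_maruyama_recover(score_net, cond, shape, sigma=25.0, n_steps=500, device="cpu"):
    """Sample a missing-modality representation by integrating the reverse-time
    VE SDE dx = sigma^t dw with an Euler-Maruyama solver.

    Assumes score_net(x, t, cond) approximates the conditional score
    grad_x log p_t(x | cond) and that shape is (batch, feature_dim).
    """
    log_sigma = torch.log(torch.tensor(sigma, device=device))
    # Marginal std of the VE SDE at t = 1 defines the Gaussian prior.
    prior_std = torch.sqrt((sigma ** 2.0 - 1.0) / (2.0 * log_sigma))
    x = torch.randn(shape, device=device) * prior_std
    dt = -1.0 / n_steps  # integrate backward from t = 1 toward t = 0
    for step in range(n_steps):
        t = torch.full((shape[0],), 1.0 + step * dt, device=device)
        g = (sigma ** t)[:, None]            # diffusion coefficient g(t)
        score = score_net(x, t, cond)        # conditional score estimate
        x_mean = x - g.pow(2) * score * dt   # reverse-drift step (dt < 0)
        x = x_mean + g * abs(dt) ** 0.5 * torch.randn_like(x)
    return x_mean  # final denoised mean
```

Each step moves the sample along the reverse-time drift -g(t)^2 * score and re-injects noise scaled by g(t); returning the last mean rather than the noisy sample is the usual small-variance trick at the end of sampling.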


TriSPrompt: A Hierarchical Soft Prompt Model for Multimodal Rumor Detection with Incomplete Modalities

Chen, Jiajun, Wu, Yangyang, Miao, Xiaoye, Zhu, Mengying, Xi, Meng

arXiv.org Artificial Intelligence

The widespread presence of incomplete modalities in multimodal data poses a significant challenge to accurate rumor detection. Existing multimodal rumor detection methods primarily focus on learning joint modality representations from complete multimodal training data, rendering them ineffective against the missing modalities common in real-world scenarios. In this paper, we propose a hierarchical soft prompt model, TriSPrompt, which integrates three types of prompts, i.e., a modality-aware (MA) prompt, a modality-missing (MM) prompt, and a mutual-views (MV) prompt, to effectively detect rumors in incomplete multimodal data. The MA prompt captures both heterogeneous information from specific modalities and homogeneous features from available data, aiding modality recovery. The MM prompt models missing states in incomplete data, enhancing the model's adaptability to missing information. The MV prompt learns relationships between subjective (i.e., text and image) and objective (i.e., comments) perspectives, effectively detecting rumors. Extensive experiments on three real-world benchmarks demonstrate that TriSPrompt achieves an accuracy gain of over 13% compared to state-of-the-art methods. The code and datasets are available at https://anonymous.4open.science/r/code-3E88.
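Reading the abstract literally, the three prompt types can be realized as learnable embedding banks prepended to the fused input of a (typically frozen) backbone: one MA prompt per modality, one MM prompt per missing-modality pattern, and a shared MV prompt. The sketch below is a rough, assumed wiring; prompt lengths, dimensions, and the concatenation order are illustrative choices, not the paper's design.

```python
import torch
import torch.nn as nn

class HierarchicalSoftPrompts(nn.Module):
    """Three learnable prompt banks (MA / MM / MV); the wiring is an assumption."""

    def __init__(self, d_model=768, prompt_len=4, n_modalities=3, n_miss_states=8):
        super().__init__()
        # MA: one prompt per modality (text, image, comments).
        self.ma = nn.Parameter(torch.randn(n_modalities, prompt_len, d_model) * 0.02)
        # MM: one prompt per missing-modality pattern (2^3 subsets here).
        self.mm = nn.Parameter(torch.randn(n_miss_states, prompt_len, d_model) * 0.02)
        # MV: a single prompt relating subjective and objective views.
        self.mv = nn.Parameter(torch.randn(prompt_len, d_model) * 0.02)

    def forward(self, feats, miss_state):
        """feats: (B, L, d) fused features; miss_state: (B,) integer pattern codes."""
        b = feats.size(0)
        ma = self.ma.reshape(-1, self.ma.size(-1)).unsqueeze(0).expand(b, -1, -1)
        mm = self.mm[miss_state]                      # (B, prompt_len, d)
        mv = self.mv.unsqueeze(0).expand(b, -1, -1)   # (B, prompt_len, d)
        # Prepend all prompts; a frozen backbone attends over them with feats.
        return torch.cat([ma, mm, mv, feats], dim=1)
```

Indexing the MM bank by an integer code of the missing pattern (e.g., a 3-bit mask over text/image/comments) is one simple way to "model missing states" as the abstract describes.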


MoE-Health: A Mixture of Experts Framework for Robust Multimodal Healthcare Prediction

Wang, Xiaoyang, Yang, Christopher C.

arXiv.org Artificial Intelligence

Healthcare systems generate diverse multimodal data, including Electronic Health Records (EHR), clinical notes, and medical images. Effectively leveraging this data for clinical prediction is challenging, particularly as real-world samples often present with varied or incomplete modalities. Existing approaches typically require complete modality data or rely on manual selection strategies, limiting their applicability in real-world clinical settings where data availability varies across patients and institutions. To address these limitations, we propose MoE-Health, a novel Mixture of Experts framework designed for robust multimodal fusion in healthcare prediction. The MoE-Health architecture is specifically developed to handle samples with differing modalities and to improve performance on critical clinical tasks. By leveraging specialized expert networks and a dynamic gating mechanism, our approach dynamically selects and combines the relevant experts for the available data modalities, enabling flexible adaptation to varying data-availability scenarios. We evaluate MoE-Health on the MIMIC-IV dataset across three critical clinical prediction tasks: in-hospital mortality, long length of stay, and hospital readmission. Experimental results demonstrate that MoE-Health achieves superior performance compared to existing multimodal fusion methods while remaining robust across different modality-availability patterns. The framework effectively integrates multimodal information, offering improved predictive performance and robustness on heterogeneous and incomplete healthcare data, making it particularly suitable for deployment in diverse healthcare environments.
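One way to make "dynamic gating over available modalities" concrete is to gate on the 0/1 availability mask and suppress experts for absent modalities before normalizing, as in the sketch below. Layer sizes, the gate's input, and the two-layer experts are assumptions; the abstract does not pin down MoE-Health's exact architecture.

```python
import torch
import torch.nn as nn

class MaskedMoEFusion(nn.Module):
    """Availability-aware mixture-of-experts fusion (illustrative sketch)."""

    def __init__(self, dims, d_hidden=256, n_out=1):
        super().__init__()
        # One expert per modality (e.g., EHR, clinical notes, images).
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d, d_hidden), nn.ReLU(), nn.Linear(d_hidden, d_hidden))
            for d in dims
        ])
        # The gate reads only the 0/1 availability pattern.
        self.gate = nn.Linear(len(dims), len(dims))
        self.head = nn.Linear(d_hidden, n_out)

    def forward(self, inputs, avail):
        """inputs: list of (B, d_i) tensors (zero-filled when missing);
        avail: (B, M) 0/1 mask; assumes >= 1 available modality per sample."""
        h = torch.stack([e(x) for e, x in zip(self.experts, inputs)], dim=1)  # (B, M, H)
        logits = self.gate(avail.float())
        logits = logits.masked_fill(avail == 0, float("-inf"))  # drop absent experts
        w = torch.softmax(logits, dim=-1).unsqueeze(-1)         # (B, M, 1)
        return self.head((w * h).sum(dim=1))                    # (B, n_out)
```

Masking before the softmax guarantees the combination weights are a proper distribution over the experts that actually received data, which is what gives the model graceful behavior under arbitrary availability patterns.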


RoHyDR: Robust Hybrid Diffusion Recovery for Incomplete Multimodal Emotion Recognition

Jin, Yuehan, Liu, Xiaoqing, Yang, Yiyuan, Yu, Zhiwen, Zhang, Tong, Yang, Kaixiang

arXiv.org Artificial Intelligence

Multimodal emotion recognition analyzes emotions by combining data from multiple sources. However, real-world noise or sensor failures often cause missing or corrupted data, creating the Incomplete Multimodal Emotion Recognition (IMER) challenge. In this paper, we propose Robust Hybrid Diffusion Recovery (RoHyDR), a novel framework that performs missing-modality recovery at unimodal, multimodal, feature, and semantic levels. For unimodal representation recovery of missing modalities, RoHyDR exploits a diffusion-based generator to generate distribution-consistent and semantically aligned representations from Gaussian noise, using available modalities as conditioning. For multimodal fusion recovery, we introduce adversarial learning to produce a realistic fused multimodal representation and recover missing semantic content. We further propose a multi-stage optimization strategy that enhances training stability and efficiency. In contrast to previous work, the hybrid diffusion and adversarial learning-based recovery mechanism in RoHyDR allows recovery of missing information in both unimodal representation and multimodal fusion, at both feature and semantic levels, effectively mitigating performance degradation caused by suboptimal optimization. Comprehensive experiments conducted on two widely used multimodal emotion recognition benchmarks demonstrate that our proposed method outperforms state-of-the-art IMER methods, achieving robust recognition performance under various missing-modality scenarios. Our code will be made publicly available upon acceptance.
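The adversarial half of the recovery can be pictured as a standard GAN objective over fused representations: a discriminator learns to separate fusions computed from complete modalities from fusions that include diffusion-recovered ones, while the recovery path is trained to fool it. The sketch below uses a plain BCE GAN loss with assumed network sizes; RoHyDR's diffusion generator and multi-stage optimization schedule are omitted.

```python
import torch
import torch.nn as nn

d_fused = 256  # assumed fused-representation width
discriminator = nn.Sequential(
    nn.Linear(d_fused, 128), nn.LeakyReLU(0.2), nn.Linear(128, 1)
)
bce = nn.BCEWithLogitsLoss()

def adversarial_losses(fused_complete, fused_recovered):
    """fused_complete: fusion of ground-truth modalities ("real");
    fused_recovered: fusion that includes recovered modalities ("fake")."""
    real = discriminator(fused_complete)
    fake = discriminator(fused_recovered.detach())  # detach for the D update
    d_loss = bce(real, torch.ones_like(real)) + bce(fake, torch.zeros_like(fake))
    # Generator side: push the recovered fusion toward the "real" decision.
    g_loss = bce(discriminator(fused_recovered), torch.ones_like(fake))
    return d_loss, g_loss
```

Training the recovery path against a discriminator on the fused representation, rather than only matching unimodal features, is what lets the method target semantic-level consistency of the fusion rather than pointwise feature reconstruction alone.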


Knowledge Bridger: Towards Training-free Missing Multi-modality Completion

Ke, Guanzhou, He, Shengfeng, Wang, Xiao Li, Wang, Bo, Chao, Guoqing, Zhang, Yuanyang, Xie, Yi, Su, HeXing

arXiv.org Artificial Intelligence

Previous successful approaches to missing modality completion rely on carefully designed fusion techniques and extensive pre-training on complete data, which can limit their generalizability in out-of-domain (OOD) scenarios. In this study, we pose a new challenge: can we develop a missing modality completion model that is both resource-efficient and robust to OOD generalization? To address this, we present a training-free framework for missing modality completion that leverages large multimodal models (LMMs). Our approach, termed the "Knowledge Bridger", is modality-agnostic and integrates generation and ranking of missing modalities. By defining domain-specific priors, our method automatically extracts structured information from available modalities to construct knowledge graphs. These extracted graphs connect the missing modality generation and ranking modules through the LMM, resulting in high-quality imputations of missing modalities. Experimental results across both general and medical domains show that our approach consistently outperforms competing methods, including in OOD generalization. Additionally, our knowledge-driven generation and ranking techniques demonstrate superiority over variants that directly employ LMMs for generation and ranking, offering insights that may be valuable for applications in other domains.
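Because the framework is training-free, it reduces to orchestration over large-multimodal-model calls: extract a knowledge graph from the available modalities, generate candidate imputations for the missing modality conditioned on that graph, and keep the candidate the model ranks most consistent. In the sketch below the three LMM wrappers are caller-supplied callables; their names and signatures are hypothetical, not the paper's API.

```python
def complete_missing_modality(available, missing_type, extract_kg, generate, score, k=4):
    """Training-free completion sketch in the spirit of Knowledge Bridger.

    extract_kg / generate / score are caller-supplied wrappers around a large
    multimodal model; these interfaces are hypothetical, not the paper's API.
    """
    graph = extract_kg(available)                       # structured knowledge graph
    candidates = [generate(graph, missing_type) for _ in range(k)]
    # Keep the imputation the model judges most consistent with the graph.
    return max(candidates, key=lambda c: score(graph, c))
```

Routing both generation and ranking through the same extracted graph is the "bridge": the structured priors constrain what gets generated and give the ranker an explicit consistency target, which is why the paper reports gains over using the LMM for generation and ranking directly.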